HDDS-15181. Added Robot smoketests for snapshot defrag on single-node compose #10200

Open
arunsarin85 wants to merge 1 commit into apache:master from arunsarin85:HDDS-15181
@arunsarin85
Contributor

What changes were proposed in this pull request?

  • Add snapshot/snapshot-defrag.robot: Robot Framework tests that exercise snapshot behavior while the OM is configured for periodic snapshot defrag in the unsecure compose/ozone environment.

  • Enable ozone.snapshot.defrag.service.interval (set to 30s) only on the OM service in docker-compose.yaml.

Please describe your PR in detail:
The tests cover snapshot defrag basics from the user-visible side: reads, listing, diff, and delete, with waits so that background defrag can run.

Scenarios in snapshot-defrag.robot (8 tests):

  1. Read snapshot data right after create - The key read through the new snapshot’s .snapshot path matches the file that was stored (/etc/hosts).
  2. After waiting, snapshot and live bucket still match - Add another key on the live bucket and wait ~65s; the first snapshot still has the old content and the live key matches /etc/passwd.
  3. Snapshot list still shows active - ozone sh snapshot ls lists the snapshot with status SNAPSHOT_ACTIVE.
  4. Second snapshot sees all keys so far - Add a third key and take a second snapshot; the older snapshot still reflects only the first key, while the newer snapshot can read all three keys.
  5. snapshot diff starts a new job - The CLI diff between the two snapshots shows the usual “new job” / --get-report messaging.
  6. snapshot diff JSON lists added keys - --get-report --json completes with DONE and lists the keys that appeared after the first snapshot.
  7. Same JSON diff after another wait - After a second ~65s wait, run the JSON diff again and still expect the same key paths in the report (stability after more defrag time).
  8. Delete older snapshot, younger one still readable - Delete the first snapshot; the listing shows SNAPSHOT_DELETED, and keys are still readable through the second snapshot path.
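
Scenarios 6 and 7 come down to checking that the `sourcePath` entries in the diff report include the keys added after the first snapshot. Below is a minimal Python sketch of that check. It assumes only the report shape implied by the jq filter used in the test (`.snapshotDiffReport.diffList | .[].sourcePath`). The sample payload, the key names `key2`/`key3`, and the helper `source_paths` are hypothetical. This is not the Robot code from the PR.

```python
import json
from typing import List


def source_paths(report_json: str) -> List[str]:
    """Return every sourcePath found in snapshotDiffReport.diffList."""
    report = json.loads(report_json)
    diff_list = (report.get("snapshotDiffReport") or {}).get("diffList") or []
    return [entry["sourcePath"] for entry in diff_list if "sourcePath" in entry]


# Hypothetical sample payload; a real report carries more fields than this.
sample = json.dumps({
    "snapshotDiffReport": {
        "diffList": [
            {"sourcePath": "./key2"},
            {"sourcePath": "./key3"},
        ]
    }
})

paths = source_paths(sample)

# Keys expected to have appeared after the first snapshot (hypothetical names).
expected_keys = ("key2", "key3")
missing = [k for k in expected_keys if not any(k in p for p in paths)]
```

An empty `missing` list corresponds to both `Should contain` assertions passing.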

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-15181

How was this patch tested?

Robot / Docker: ran execute_robot_test scm snapshot/snapshot-defrag.robot against hadoop-ozone/dist/target/ozone-*/compose/ozone after building with mvn clean package -DskipTests -Pdist (with the correct /opt/hadoop mount).

robot-001.xml

Comment on lines +44 to +45
# OM-only: periodic snapshot defrag for smoketest snapshot/snapshot-defrag.robot (HDDS-15181)
OZONE-SITE.XML_ozone.snapshot.defrag.service.interval: 30s

Snapshot Diff Json Unchanged After Another Defrag Wait
[Documentation] Run the same JSON diff again after waiting again; the reported key paths should still be there.
Sleep ${DEFRAG_WAIT_SECONDS}
Checking the DB / YAML metadata would confirm more reliably that defrag has actually run.

Should contain echo '${result}' | jq '.snapshotDiffReport.diffList | .[].sourcePath' ${KEY_TWO}
Should contain echo '${result}' | jq '.snapshotDiffReport.diffList | .[].sourcePath' ${KEY_THREE}

Snapshot Diff Json Unchanged After Another Defrag Wait
Is the intention of this test to retrigger the snapshot diff?

By default, the result of a snapshot diff is persisted for a while, so a rerun without clearing the previous result won't actually retrigger the diff (see ozone.om.snapshot.diff.job.report.persistent.time).
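
To illustrate the concern, here is a toy Python model of a diff report that is kept per snapshot pair for a fixed time. It is only a sketch of the behavior described in this comment, not Ozone's implementation. The class `DiffReportCache`, the 600-second window, and the snapshot and key names are all hypothetical.

```python
import time
from typing import Callable, Dict, List, Tuple


class DiffReportCache:
    """Toy model: keep one diff report per snapshot pair for ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._reports: Dict[Tuple[str, str], Tuple[float, List[str]]] = {}
        self.computations = 0  # how many times a diff was really computed

    def diff(self, from_snap: str, to_snap: str,
             compute: Callable[[], List[str]]) -> List[str]:
        key = (from_snap, to_snap)
        now = self._clock()
        cached = self._reports.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # persisted report is returned, no new diff
        self.computations += 1
        report = compute()
        self._reports[key] = (now, report)
        return report


# Fake clock so the example needs no real waiting.
fake_now = [0.0]
cache = DiffReportCache(ttl_seconds=600, clock=lambda: fake_now[0])


def compute_diff() -> List[str]:
    return ["./key2", "./key3"]


first = cache.diff("snap1", "snap2", compute_diff)
fake_now[0] += 65  # the test's second wait
second = cache.diff("snap1", "snap2", compute_diff)
runs_within_window = cache.computations

fake_now[0] += 600  # move past the persistence window
third = cache.diff("snap1", "snap2", compute_diff)
```

In this model the second call after the 65-second wait returns the stored report without recomputing, so it says nothing about the state after defrag. Only a call made after the window has passed, or after the stored result is cleared, triggers a fresh diff.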
